A Contribution of Intrinsic Speech Variabilities to Errors Done by Speech Recognition

Author

Miloš Cerňak

Abstract

The accuracy of automatic speech recognition (ASR) is usually evaluated by computing the Word Error Rate (WER) and the Sentence Error Rate (SER). The misrecognitions that contribute to WER fall into three categories: deletions, insertions, and substitutions. This paper studies how intrinsic speech variabilities contribute to each error category, using decision tree (DT) analysis. Five DT styles are examined: CART, C4.5, Minimum Message Length (MML), strict MML, and Bayesian decision trees. We apply these techniques to the output of a speech recognizer fed with intrinsically variable speech.
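As an illustration of the error categories named in the abstract, the following is a minimal sketch of how WER and its three components can be computed with a standard word-level Levenshtein alignment. It is not taken from the paper; the function names `align_errors` and `word_error_rate` are hypothetical, and whitespace tokenization is an assumption. When several alignments have the same minimal cost, the split between categories depends on the tie-breaking order, although the total error count does not.

```python
def align_errors(reference, hypothesis):
    """Return (substitutions, deletions, insertions) for one sentence pair.

    Uses word-level Levenshtein alignment with unit costs. Tokenization by
    whitespace is an assumption made for this sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    # cost[i][j] = (total_errors, subs, dels, ins) for ref[:i] vs hyp[:j]
    cost = [[None] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, len(ref) + 1):
        cost[i][0] = (i, 0, i, 0)  # every reference word deleted
    for j in range(1, len(hyp) + 1):
        cost[0][j] = (j, 0, 0, j)  # every hypothesis word inserted
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            e, s, d, n = cost[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diagonal = (e, s, d, n)          # correct word, no error
            else:
                diagonal = (e + 1, s + 1, d, n)  # substitution
            e, s, d, n = cost[i - 1][j]
            deletion = (e + 1, s, d + 1, n)
            e, s, d, n = cost[i][j - 1]
            insertion = (e + 1, s, d, n + 1)
            # Ties are broken in the order diagonal, deletion, insertion.
            cost[i][j] = min(diagonal, deletion, insertion, key=lambda c: c[0])
    _, subs, dels, ins = cost[len(ref)][len(hyp)]
    return subs, dels, ins


def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, where N is the number of reference words."""
    n_ref = len(reference.split())
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = align_errors(reference, hypothesis)
    return (subs + dels + ins) / n_ref
```

For example, `align_errors("a b c", "a c")` counts a single deletion, so the WER for that pair is 1/3. SER can be obtained in the same way by counting the fraction of sentences with at least one error.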


Similar resources

Persian Phone Recognition Using Acoustic Landmarks and Neural Network-based variability compensation methods

Speech recognition is a subfield of artificial intelligence that develops technologies to convert speech utterances into transcriptions. So far, various methods such as hidden Markov models and artificial neural networks have been used to develop speech recognition systems. In most of these systems, the speech signal frames are processed uniformly, while the information is not evenly distributed ...


Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines

This paper introduces the new OLdenburg LOgatome speech corpus (OLLO) and outlines design considerations during its creation. OLLO is distinct from previous ASR corpora as it specifically targets (1) the fair comparison between human and machine speech recognition performance, and (2) the realistic representation of intrinsic variabilities in speech that are significant for automatic speech rec...


Emotional Aspects of Intrinsic Speech Variabilities in Automatic Speech Recognition

We analyze two German databases: the OLLO database [1] designed for doing speech recognition experiments on speech variabilities, and the Berlin emotional database [2] designed for the analysis and synthesis of emotional speech. The paper tries to find a relation between intrinsic speech variabilities and the emotions. Moreover, we study this relation from the point of view of speech recognitio...


Complementarity of MFCC, PLP and Gabor features in the presence of speech-intrinsic variabilities

In this study, the effect of speech-intrinsic variabilities such as speaking rate, effort and speaking style on automatic speech recognition (ASR) is investigated. We analyze the influence of such variabilities as well as extrinsic factors (i.e., additive noise) on the most common features in ASR (mel-frequency cepstral coefficients and perceptual linear prediction features) and spectro-tempora...


A New Method for Robust Speech Recognition Based on Missing Data Using a Bidirectional Neural Network

The performance of speech recognition systems is greatly reduced when speech is corrupted by noise. One common approach to robust speech recognition is the missing-feature method. In this approach, the components of the time-frequency representation of the signal (spectrogram) that have a low signal-to-noise ratio (SNR) are tagged as missing and deleted, then replaced using the remaining components and statistical ...




Publication date: 2008
